21 research outputs found

    High Performance Computing using Infiniband-based clusters

    The abstract is in the attachment.

    Analysis and optimization of synchronization algorithms for multicore architectures

    Multicore design is a major issue in modern computer architectures, and programmers are urged to design innovative algorithms that exploit multicore facilities. Because synchronization affects the performance of multithreaded algorithms, selecting an effective synchronization mechanism is critical in multicore environments. Modern computers provide special hardware instructions that atomically read and modify the content of a word (e.g., the cmpxchg instruction in Intel x86 CPUs), so they can be used to synchronize threads. Software techniques can also synchronize threads without depending on such hardware instructions. This study considers the main synchronization techniques: Ticket lock, which guarantees fair execution to all threads; Filter lock, which is intended for multiple threads; Readers-writer lock, which aims to solve the readers-writers problem; and Read-Copy Update (RCU), which reduces the overhead of the readers-writer lock. The first contribution of this study is an evaluation of the costs of these synchronization techniques, which arise from, for example, memory accesses, system calls, and spinning, i.e., repeatedly querying (or in some cases modifying) an object in memory while waiting for its content to change. State-of-the-art approaches exploit hardware or software techniques to reduce these costs. The second contribution is an analysis of both hardware and software solutions for reducing synchronization costs, together with a comparative study that highlights the benefits and drawbacks of the different synchronization mechanisms. The software solutions investigated are well-known techniques: backoff, a waiting time that reduces bus traffic; non-blocking algorithms, which synchronize without blocking primitives; and compiler barriers, compiler directives that prevent instruction reordering.
    Besides software solutions, hardware manufacturers have introduced various facilities in shared-memory and distributed environments to enhance the performance of synchronization mechanisms. Examples are hardware message passing and multiple cache layers in shared-memory environments, and Remote Direct Memory Access (RDMA) in distributed environments. Experimental benchmarks were executed on one node of a cluster (Opteron 6276 2.3 GHz CPU with 16 cores, running the CentOS 6.3 operating system). The experiments, which are intended as a useful aid for researchers and practitioners interested in optimizing parallel algorithms, show that: 1. The update rate directly impacts performance, even when a non-blocking algorithm is used. 2. The cost of keeping data locality should not exceed the cost of cache misses. 3. Using a non-blocking synchronization algorithm (i.e., RCU) leads to better performance. 4. Critical sections should be kept as short as possible to increase performance. 5. Spinning should be avoided in order to reduce bus traffic. 6. Hardware message passing can increase the performance of the shared-memory synchronization model. 7. Synchronization methods with heavy instructions should be avoided.
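
    As an illustration of the Ticket lock named in this abstract, the following is a minimal C++ sketch and not the implementation evaluated in the study. The class name TicketLock and the helper count_with_lock are hypothetical, and std::this_thread::yield() is only a simple stand-in for a real backoff policy.

```cpp
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Minimal ticket lock: threads enter the critical section in the order
// in which they took a ticket, which is what gives the lock its fairness.
class TicketLock {
public:
    void lock() {
        // Take the next ticket with a single atomic read-modify-write.
        const std::size_t my_ticket =
            next_ticket_.fetch_add(1, std::memory_order_relaxed);
        // Spin until this ticket is the one being served.
        while (now_serving_.load(std::memory_order_acquire) != my_ticket) {
            std::this_thread::yield();  // assumption: stand-in for backoff
        }
    }

    void unlock() {
        // Only the lock holder writes now_serving_, so a plain
        // load followed by a release store is sufficient here.
        now_serving_.store(
            now_serving_.load(std::memory_order_relaxed) + 1,
            std::memory_order_release);
    }

private:
    std::atomic<std::size_t> next_ticket_{0};
    std::atomic<std::size_t> now_serving_{0};
};

// Hypothetical helper: every thread increments a shared, non-atomic
// counter while holding the lock, and the final value is returned.
long count_with_lock(int threads, int iterations) {
    TicketLock lock;
    long counter = 0;  // protected only by the ticket lock
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

    If the lock provides mutual exclusion, count_with_lock(4, 2000) returns 8000. Every waiting thread spins on the same now_serving_ word, which is the kind of bus traffic that backoff is meant to reduce.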

    A Review on Fall Prediction and Prevention System for Personal Devices: Evaluation and Experimental Results

    Injuries due to unintentional falls carry a high social cost, and several systems have been developed to reduce them. Recently, two trends can be recognized. First, the market is dominated by fall detection systems, which raise an alarm after a fall has occurred, but the focus is moving towards predicting and preventing falls, as this is the most promising way to avoid fall injuries. Second, personal devices such as smartphones are being exploited to implement fall systems, because users carry them for most of the day. This paper reviews fall prediction and prevention systems, with particular interest in those that can rely on the sensors embedded in a smartphone, i.e., the accelerometer and gyroscope. Kinematic features obtained from accelerometer and gyroscope data have been evaluated in combination with different machine learning algorithms. An experimental analysis compares the approaches in terms of their accuracy and their ability to predict and prevent a fall. Results show that tilt features in combination with a decision tree algorithm give the best performance.
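
    The abstract does not define its tilt features. One common choice, assumed here purely for illustration, is the angle between the measured acceleration vector and the device z axis. The names Accel and tilt_degrees are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// One accelerometer sample, in any consistent unit (e.g., m/s^2).
struct Accel {
    double x;
    double y;
    double z;
};

// Tilt angle in degrees between the acceleration vector and the z axis.
// 0 means the vector points along +z; 90 means it lies in the x-y plane.
double tilt_degrees(const Accel& a) {
    const double norm = std::sqrt(a.x * a.x + a.y * a.y + a.z * a.z);
    assert(norm > 0.0);  // a zero vector has no direction
    double cosine = a.z / norm;
    // Clamp against rounding so that acos always receives a valid input.
    if (cosine > 1.0) cosine = 1.0;
    if (cosine < -1.0) cosine = -1.0;
    const double pi = std::acos(-1.0);
    return std::acos(cosine) * 180.0 / pi;
}
```

    A classifier such as a decision tree would then consume this angle, or statistics of it over a time window, as one input feature.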

    Nonlinear predictive threshold model for real-time abnormal gait detection

    Falls are critical events for human health due to the associated risk of physical and psychological injuries. Several fall-related systems have been developed in order to reduce injuries. Among them, fall-risk prediction systems are one of the most promising approaches, as they strive to predict a fall before its occurrence. One category of fall-risk prediction systems evaluates balance and muscle strength through clinical functional assessment tests, while other prediction systems investigate the recognition of abnormal gait patterns to predict a fall in real time. The main contribution of this paper is a nonlinear model of user gait in combination with a threshold-based classification in order to recognize abnormal gait patterns with low complexity and high accuracy. In addition, a dataset with realistic parameters is prepared to simulate abnormal walks and to evaluate fall prediction methods. The accelerometer and gyroscope sensors available in a smartphone have been exploited to create the dataset. The proposed approach has been implemented and compared with state-of-the-art approaches, showing that it is able to predict an abnormal walk with higher accuracy (93.5%) and higher efficiency (up to 3.5 times faster) than other feasible approaches.

    Internet of Things for fall prediction and prevention

    The Internet of Things (IoT) is making a breakthrough in the development of innovative healthcare systems. IoT-based health applications are expected to change the paradigm traditionally followed by physicians for diagnosis, by moving health monitoring from the clinical environment to the domestic space. Fall avoidance is a field where the continuous monitoring allowed by an IoT-based framework offers tremendous benefits to the user. In fact, falls are highly damaging due to both physical and psychological injuries. Currently, the most promising approaches to reduce fall injuries are fall prediction, which strives to predict a fall before its occurrence, and fall prevention, which assesses balance and muscle strength through clinical functional tests. In this context, the IoT-based framework provides real-time emergency notification as soon as a fall is predicted, mid-term analysis of the monitored activities, and data logging for long-term analysis by clinical experts. This approach gives experts more information for estimating the risk of a future fall and for suggesting proper exercises.

    In-network monitoring strategies for HPC cloud

    No full text
    Accepted for AINA 2024.

    Communicating Efficiently on Cluster-Based Remote Direct Memory Access (RDMA) over InfiniBand Protocol

    Distributed systems are commonly built under the assumption that the network is the primary bottleneck; however, this assumption no longer holds with the emergence of high-performance RDMA-enabled protocols in datacenters. Designing distributed applications over such protocols requires fundamentally rethinking the communication components compared with traditional protocols (i.e., TCP/IP). In this paper, the communication paradigms of existing systems and possible new paradigms are investigated. The advantages and drawbacks of each paradigm are comprehensively analyzed and experimentally evaluated. The experimental results show that writing requests to the server and reading the responses performs up to 10 times better than other communication paradigms. To extend the investigation, the proposed communication paradigm was substituted into a real-world distributed application, improving its performance by up to seven times.

    Analyzing In-Memory NoSQL Landscape

    No full text
    In-memory key-value stores have quickly become a key enabling technology for building high-performance applications that must cope with massively distributed workloads. In-memory key-value stores (also referred to as NoSQL) primarily aim to offer low-latency, high-throughput data access, which motivates the rapid adoption of modern network cards that support Remote Direct Memory Access (RDMA). In this paper, we present the fundamental design principles for exploiting RDMA in modern NoSQL systems. Moreover, we give a breakdown analysis of state-of-the-art RDMA-based in-memory NoSQL systems with regard to indexing, data consistency, and the communication protocol. In addition, we compare traditional in-memory NoSQL systems with their RDMA-enabled counterparts. Finally, we present a comprehensive analysis and evaluation of the existing systems according to the impact of the number of clients, real-world request distributions, and workload read-write ratios.

    Eigenwalk: a Novel Feature for Walk Classification and Fall Prediction

    No full text
    Predicting a fall is one of the most promising approaches to avoid it. Different studies strive to classify abnormal and normal walks in order to predict a fall before its occurrence. This study introduces eigenwalk, a novel feature based on the principal components of the accelerometer and gyroscope signals. This feature, in conjunction with a random forest classifier, is able to distinguish walk patterns and to estimate a fall risk. As the accelerometer and the gyroscope embedded in a smartphone are recognized to be precise enough for fall avoidance systems, they have been exploited in an experimental analysis in order to compare the proposed approach with the most recent ones. The results show that the new feature in combination with random forest classification outperforms state-of-the-art approaches, improving the accuracy to up to 98.6%.
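
    The abstract does not give the exact definition of eigenwalk. As a rough illustration of what "principal components of a sensor signal" means, the sketch below estimates the leading principal component of a window of 3-axis samples by power iteration. The names Vec3 and leading_component, the start vector, and the iteration count are assumptions made for this example and are not taken from the study.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec3 = std::array<double, 3>;

// Returns a unit vector along the leading principal component of the
// samples, estimated by power iteration on the unnormalized covariance
// matrix. Returns {0, 0, 0} if there are no samples, if the samples have
// no variance, or if the start vector carries none of the variance.
Vec3 leading_component(const std::vector<Vec3>& samples,
                       int iterations = 100) {
    Vec3 mean{0.0, 0.0, 0.0};
    if (samples.empty()) return mean;

    // Per-axis mean, used to center the data.
    for (const Vec3& s : samples)
        for (std::size_t i = 0; i < 3; ++i) mean[i] += s[i];
    for (double& m : mean) m /= static_cast<double>(samples.size());

    // Covariance up to a constant factor, which does not change
    // the direction of the eigenvectors.
    double cov[3][3] = {};
    for (const Vec3& s : samples)
        for (std::size_t i = 0; i < 3; ++i)
            for (std::size_t j = 0; j < 3; ++j)
                cov[i][j] += (s[i] - mean[i]) * (s[j] - mean[j]);

    Vec3 v{1.0, 1.0, 1.0};  // assumption: arbitrary start vector
    for (int it = 0; it < iterations; ++it) {
        Vec3 w{0.0, 0.0, 0.0};
        for (std::size_t i = 0; i < 3; ++i)
            for (std::size_t j = 0; j < 3; ++j)
                w[i] += cov[i][j] * v[j];
        const double norm =
            std::sqrt(w[0] * w[0] + w[1] * w[1] + w[2] * w[2]);
        if (norm == 0.0) return Vec3{0.0, 0.0, 0.0};
        for (std::size_t i = 0; i < 3; ++i) v[i] = w[i] / norm;
    }
    return v;
}
```

    For samples that all lie along the direction (1, 2, 0), the function returns that direction normalized to unit length. A feature vector built from such components could then be passed to a classifier such as a random forest.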